Research
Security News
Malicious npm Packages Inject SSH Backdoors via Typosquatted Libraries
Socket’s threat research team has detected six malicious npm packages typosquatting popular libraries to insert SSH backdoors.
commonmark
The commonmark npm package is a JavaScript implementation of the CommonMark specification, which is a strongly defined, highly compatible specification of Markdown. It allows you to parse and render Markdown content in a consistent and predictable manner.
Parsing Markdown to AST
This feature allows you to parse Markdown text into an Abstract Syntax Tree (AST). The AST can then be manipulated or traversed for various purposes.
const commonmark = require('commonmark');
const reader = new commonmark.Parser();
const parsed = reader.parse('# Hello World');
console.log(parsed);
Rendering AST to HTML
This feature allows you to render the parsed AST to HTML. This is useful for converting Markdown content into HTML for web pages.
const commonmark = require('commonmark');
const reader = new commonmark.Parser();
const writer = new commonmark.HtmlRenderer();
const parsed = reader.parse('# Hello World');
const result = writer.render(parsed);
console.log(result);
Customizing the Renderer
This feature allows you to customize the rendering process by extending the HtmlRenderer class. You can override methods to change how specific elements are rendered.
const commonmark = require('commonmark');
const reader = new commonmark.Parser();
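class CustomRenderer extends commonmark.HtmlRenderer {
  // Override the handler for text nodes to wrap each one in a <span>
  text(node) {
    // Escape the literal so characters such as < and & are not emitted as raw HTML
    this.lit('<span>' + this.esc(node.literal) + '</span>');
  }
}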
const writer = new CustomRenderer();
const parsed = reader.parse('# Hello World');
const result = writer.render(parsed);
console.log(result);
Marked is a fast, lightweight Markdown parser and compiler. It is designed to be simple to use and highly customizable. Compared to commonmark, marked is known for its speed and flexibility, but it may not adhere as strictly to the CommonMark specification.
Markdown-it is a Markdown parser that is both fast and extensible. It supports plugins and offers a high degree of customization. Unlike commonmark, markdown-it provides more features out of the box, such as tables and strikethrough, and it exposes a highlight option as a hook for a syntax highlighter. Custom containers are available through a plugin.
Remark is a Markdown processor powered by plugins. It can parse, transform, and compile Markdown. Remark is highly modular and allows for extensive customization through its plugin system. It offers more flexibility compared to commonmark but may require more setup.
CommonMark is a rationalized version of Markdown syntax, with a spec and BSD-licensed reference implementations in C and JavaScript.
For more information, see http://commonmark.org.
This repository contains the JavaScript reference implementation. It provides a library with functions for parsing CommonMark documents to an abstract syntax tree (AST), manipulating the AST, and rendering the document to HTML or to an XML representation of the AST.
To play with this library without installing it, see the live dingus at http://try.commonmark.org/.
You can install the library using npm:
npm install commonmark
This package includes the commonmark library and a command-line executable, commonmark.
For client-side use, fetch the latest from https://raw.githubusercontent.com/commonmark/commonmark.js/master/dist/commonmark.js, or bower install commonmark.
Make sure to fetch dependencies with:
npm install
To build standalone JavaScript files (dist/commonmark.js and dist/commonmark.min.js):
make dist
To run tests for the JavaScript library:
make test
To run benchmarks against some other JavaScript converters:
make bench
To start an interactive dingus that you can use to try out the library:
make dingus
Instead of converting Markdown directly to HTML, as most converters do, commonmark.js parses Markdown to an AST (abstract syntax tree), and then renders this AST as HTML. This opens up the possibility of manipulating the AST between parsing and rendering. For example, one could transform emphasis into ALL CAPS.
Here's a basic usage example:
var reader = new commonmark.Parser();
var writer = new commonmark.HtmlRenderer();
var parsed = reader.parse("Hello *world*"); // parsed is a 'Node' tree
// transform parsed if you like...
var result = writer.render(parsed); // result is a String
The constructors for Parser and HtmlRenderer take an optional options parameter:
var reader = new commonmark.Parser({smart: true});
var writer = new commonmark.HtmlRenderer({sourcepos: true});
Parser currently supports the following:
smart: if true, straight quotes will be made curly, -- will be changed to an en dash, --- will be changed to an em dash, and ... will be changed to ellipses.
Both HtmlRenderer and XmlRenderer (see below) support these options:
sourcepos: if true, source position information for block-level elements will be rendered in the data-sourcepos attribute (for HTML) or the sourcepos attribute (for XML).
safe: if true, raw HTML will not be passed through to HTML output (it will be replaced by comments), and potentially unsafe URLs in links and images (those beginning with javascript:, vbscript:, file:, and with a few exceptions data:) will be replaced with empty strings.
softbreak: specify raw string to be used for a softbreak.
esc: specify a function to be used to escape strings. Its first argument is the string to be escaped, the second argument is a boolean indicating whether to preserve entities in that string.
For example, to make soft breaks render as hard breaks in HTML:
var writer = new commonmark.HtmlRenderer({softbreak: "<br />"});
To make them render as spaces:
var writer = new commonmark.HtmlRenderer({softbreak: " "});
XmlRenderer serves as an alternative to HtmlRenderer and will produce an XML representation of the AST:
var writer = new commonmark.XmlRenderer({sourcepos: true});
The parser returns a Node. The following public properties are defined (those marked "read-only" have only a getter, not a setter):
type (read-only): a String, one of text, softbreak, linebreak, emph, strong, html_inline, link, image, code, document, paragraph, block_quote, item, list, heading, code_block, html_block, thematic_break.
firstChild (read-only): a Node or null.
lastChild (read-only): a Node or null.
next (read-only): a Node or null.
prev (read-only): a Node or null.
parent (read-only): a Node or null.
sourcepos (read-only): an Array with the following form: [[startline, startcolumn], [endline, endcolumn]].
isContainer (read-only): true if the Node can contain other Nodes as children.
literal: the literal String content of the node or null.
destination: link or image destination (String) or null.
title: link or image title (String) or null.
info: fenced code block info string (String) or null.
level: heading level (Number).
listType: a String, either bullet or ordered.
listTight: true if list is tight.
listStart: a Number, the starting number of an ordered list.
listDelimiter: a String, either ) or . for an ordered list.
onEnter, onExit: Strings, used only for custom_block or custom_inline.
Nodes have the following public methods:
appendChild(child): Append a Node child to the end of the Node's children.
prependChild(child): Prepend a Node child to the beginning of the Node's children.
unlink(): Remove the Node from the tree, severing its links with siblings and parents, and closing up gaps as needed.
insertAfter(sibling): Insert a Node sibling after the Node.
insertBefore(sibling): Insert a Node sibling before the Node.
walker(): Returns a NodeWalker that can be used to iterate through the Node tree rooted in the Node.
The NodeWalker returned by walker() has two methods:
next(): Returns an object with properties entering (a boolean, which is true when we enter a Node from a parent or sibling, and false when we reenter it from a child) and node (the current Node). Returns null when we have finished walking the tree.
resumeAt(node, entering): Resets the iterator to resume at the specified node and setting for entering. (Normally this isn't needed unless you do destructive updates to the Node tree.)
Here is an example of the use of a NodeWalker to iterate through the tree, making transformations. This simple example converts the contents of all text nodes to ALL CAPS:
var walker = parsed.walker();
var event, node;
while ((event = walker.next())) {
  node = event.node;
  if (event.entering && node.type === 'text') {
    node.literal = node.literal.toUpperCase();
  }
}
This more complex example converts emphasis to ALL CAPS:
var walker = parsed.walker();
var event, node;
var inEmph = false;
while ((event = walker.next())) {
  node = event.node;
  if (node.type === 'emph') {
    if (event.entering) {
      inEmph = true;
    } else {
      inEmph = false;
      // add Emph node's children as siblings
      while (node.firstChild) {
        node.insertBefore(node.firstChild);
      }
      // remove the empty Emph node
      node.unlink();
    }
  } else if (inEmph && node.type === 'text') {
    node.literal = node.literal.toUpperCase();
  }
}
Exercises for the reader: write a transform to remove all raw HTML (html_inline and html_block nodes), or to run fenced code blocks through a syntax highlighting library, replacing them with an HtmlBlock containing the highlighted code.
The command line executable parses CommonMark input from the specified files, or from stdin if no files are specified, and renders the result to stdout as HTML. If multiple input files are specified, their contents are concatenated before parsing, with newlines between them.
commonmark inputfile.md > outputfile.html
commonmark intro.md chapter1.md chapter2.md > book.html
Use commonmark --help to get a summary of options.
The library does not attempt to sanitize link attributes or raw HTML. If you use this library in applications that accept untrusted user input, you should either enable the safe option (see above) or run the output through an HTML sanitizer to protect against XSS attacks.
Performance is excellent, roughly on par with marked. On a benchmark converting an 11 MB Markdown file built by concatenating the Markdown sources of all localizations of the first edition of Pro Git by Scott Chacon, the command-line tool, commonmark, is just a bit slower than the C program discount, roughly ten times faster than PHP Markdown, a hundred times faster than Python Markdown, and more than a thousand times faster than Markdown.pl.
Here are some focused benchmarks of four JavaScript libraries (using versions available on 24 Jan 2015). They test performance on different kinds of Markdown texts. (Most of these samples are taken from the markdown-it repository.) Results show a ratio of ops/second (higher is better) against showdown (which is usually the slowest implementation). Versions: showdown 1.3.0, marked 0.3.5, commonmark.js 0.22.1, markdown-it 5.0.2, node 5.3.0. Hardware: 1.6GHz Intel Core i5, Mac OSX.
Sample | showdown | commonmark | marked | markdown-it |
---|---|---|---|---|
README.md | 1 | 3.6 | 3.1 | 3.9 |
block-bq-flat.md | 1 | 4.8 | 4.9 | 4.9 |
block-bq-nested.md | 1 | 11.9 | 6.8 | 10.7 |
block-code.md | 1 | 4.7 | 12.1 | 23.0 |
block-fences.md | 1 | 6.2 | 21.2 | 19.1 |
block-heading.md | 1 | 5.0 | 4.8 | 6.5 |
block-hr.md | 1 | 3.5 | 3.3 | 3.5 |
block-html.md | 1 | 2.1 | 0.9 | 3.8 |
block-lheading.md | 1 | 5.1 | 4.9 | 3.9 |
block-list-flat.md | 1 | 4.7 | 4.4 | 7.4 |
block-list-nested.md | 1 | 9.5 | 7.8 | 17.6 |
block-ref-flat.md | 1 | 0.8 | 0.5 | 0.6 |
block-ref-nested.md | 1 | 0.7 | 0.6 | 0.9 |
inline-autolink.md | 1 | 2.3 | 3.4 | 2.5 |
inline-backticks.md | 1 | 7.6 | 5.3 | 8.2 |
inline-em-flat.md | 1 | 1.5 | 1.1 | 1.6 |
inline-em-nested.md | 1 | 1.8 | 1.3 | 1.7 |
inline-em-worst.md | 1 | 2.4 | 1.5 | 2.5 |
inline-entity.md | 1 | 2.0 | 3.8 | 2.7 |
inline-escape.md | 1 | 2.2 | 1.4 | 5.0 |
inline-html.md | 1 | 2.9 | 3.7 | 3.3 |
inline-links-flat.md | 1 | 2.7 | 2.7 | 2.2 |
inline-links-nested.md | 1 | 1.4 | 0.5 | 0.5 |
inline-newlines.md | 1 | 2.3 | 2.0 | 3.5 |
lorem1.md | 1 | 6.0 | 2.9 | 3.3 |
rawtabs.md | 1 | 4.6 | 3.9 | 6.7 |
To generate this table:
make bench-detailed
John MacFarlane wrote the first version of the JavaScript implementation. The block parsing algorithm was worked out together with David Greenspan. Kārlis Gaņģis helped work out a better parsing algorithm for links and emphasis, eliminating several worst-case performance issues. Vitaly Puzrin has offered much good advice about optimization and other issues.
FAQs
a strongly specified, highly compatible variant of Markdown
The npm package commonmark receives a total of 436,356 weekly downloads. As such, commonmark popularity was classified as popular.
We found that commonmark demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.